The Policy Iteration Algorithm for Average Continuous Control of Piecewise Deterministic Markov Processes



Similar Resources

Uniform Assymptotics in the Average Continuous Control of Piecewise Deterministic Markov Processes : Vanishing Approach

This result has been generalized by Feller (cf. [12], XIII.5) to the case of uncontrolled deterministic dynamics in continuous time, [2] to deterministic controlled dynamics, etc. A further generalization (cf. [19]) allows the limit value function with respect to a system governed by controlled deterministic dynamics to depend on the initial data. In the Brownian diffusion setting, similar resu...


Piecewise Deterministic Markov Processes for Continuous-Time Monte Carlo

Recently there have been conceptually new developments in Monte Carlo methods through the introduction of new MCMC and sequential Monte Carlo (SMC) algorithms which are based on continuous-time, rather than discrete-time, Markov processes. This has led to some fundamentally new Monte Carlo algorithms which can be used to sample from, say, a posterior distribution. Interestingly, continuous-time...


The Policy Iteration Algorithm for Average Reward Markov Decision Processes with General State Space

The average cost optimal control problem is addressed for Markov decision processes with unbounded cost. It is found that the policy iteration algorithm generates a sequence of policies which are c-regular (a strong stability condition), where c is the cost function under consideration. This result only requires the existence of an initial c-regular policy and an irreducibility condition on the...


Policy Iteration for Continuous-Time Average Reward Markov Decision Processes in Polish Spaces

(ii) A is an action space, which is also supposed to be a Polish space, and A(x) is a Borel set which denotes the set of available actions at state x ∈ S. The set K := {(x, a) : x ∈ S, a ∈ A(x)} is assumed to be a Borel subset of S × A. (iii) q(· | x, a) denotes the transition rates, and they are supposed to satisfy the following properties: for each (x, a) ∈ K and D ∈ B(S), (Q1) D → q...


Numerical methods for optimal control of piecewise deterministic Markov processes

Scientific Research context: In 1980, M.H.A. Davis [1] introduced in probability theory Piecewise Deterministic Markov Processes (PDMP) as a general class of models suitable for formulating optimization problems in queuing and inventory systems, maintenance-replacement models, investment scheduling and many other areas of operation research. In the continuous-time context, stochastic control th...



Journal

Journal title: Applied Mathematics & Optimization

Year: 2010

ISSN: 0095-4616,1432-0606

DOI: 10.1007/s00245-010-9099-4